Syllable-based acoustic modeling for Japanese spontaneous speech recognition
نویسندگان
چکیده
We study on a syllable-based acoustic modeling method for Japanese spontaneous speech recognition. Traditionally, mora-based acoustic models have been adopted for Japanese read speech recognition systems. In this paper, syllable-based unit and mora-based unit are clearly distinguished in their definition, and syllables are shown to be more suitable as an acoustic model for Japanese spontaneous speech recognition. In spontaneous speech, a vowel lengthening occurs frequently, and recognition accuracy is greatly affected by this phenomena. From this viewpoint, we propose an acoustic modeling technique that explicitly incorporates the vowel lengthening in syllable-based HMMs. Experimental results showed that the proposed model could exceed the performance of conventionally used cross-word triphone model and mora-based model in Japanese spontaneous speech recognition task.
منابع مشابه
Acoustic modeling for spontaneous speech recognition using syllable dependent models
This paper proposes a syllable context dependent model for spontaneous speech recognition. It is generally assumed that, since spontaneous speech is greatly affected by coarticulation, an acoustic model featuring a longer range phonemic context is required to achieve a high degree of recognition accuracy. This motivated the authors to investigate a tri-syllable model that takes differences in t...
متن کاملAllophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
متن کاملModeling Frequent Allophones in Jap
In this paper, we describe a technique to model frequent allophones in Japanese speech recognition. The Consonant-Vowel syllabic structure (CV) is dominant in Japanese. Based on frequency, the distribution of CV pairs is rather skewed. Isolating out the most frequent allophones through the use of additional phonemes in acoustic modeling can achieve better recognition accuracy. By introducing te...
متن کاملTowards the creation of acoustic models for stressed Japanese speech
In error recovery utterance, the user using the speech recognition system changes his or her speaking style to aid the system in recognizing the speech. However, this change leads the mismatch between the acoustic models and reduces the performance of the system. This degradation causes a serious problem of speech recognition for a dialog system or a speech translation system. In error recovery...
متن کاملRecent Progress in Corpus-Based Spontaneous Speech Recognition
This paper overviews recent progress in the development of corpus-based spontaneous speech recognition technology. Although speech is in almost any situation spontaneous, recognition of spontaneous speech is an area which has only recently emerged in the field of automatic speech recognition. Broadening the application of speech recognition depends crucially on raising recognition performance f...
متن کامل